Prediction of flare following remission and treatment withdrawal in early rheumatoid arthritis: post hoc analysis of a phase IIIb trial with abatacept

Background
Drug-free remission is a desirable goal in rheumatoid arthritis (RA) for both patients and clinicians. The aim of this post hoc analysis was to investigate whether clinical and magnetic resonance imaging (MRI) variables in patients with early RA who achieved remission with methotrexate and/or abatacept at 12 months could predict disease flare following treatment withdrawal.

Methods
In the AVERT study of abatacept in early RA, patients with low disease activity at month 12 entered a 12-month period with all treatment discontinued (withdrawal, WD). This post hoc analysis assessed predictors of disease flare at WD+6 months (mo) and WD+12mo of patients with Disease Activity Score in 28 joints (DAS28)-defined remission (DAS28[C-reactive protein (CRP)] <2.6) at withdrawal using univariate and multivariable regression models. Predictors investigated included the Health Assessment Questionnaire–Disability Index (HAQ-DI), pain, Patient Global Assessment; MRI synovitis, erosion, bone edema, and combined (synovitis + bone edema) inflammation scores.

Results
Remission was achieved by 172 patients; 100 (58%) and 113 (66%) patients had experienced a flare at WD+6mo and WD+12mo, respectively. In univariate analyses, higher HAQ-DI and MRI synovitis, erosion, bone edema, and combined inflammation scores at WD were identified as potential predictors of flare (P ≤ 0.01). In multivariable analysis, high scores at WD for HAQ-DI and MRI erosion were confirmed as independent predictors of flare at WD+6mo and WD+12mo (P < 0.01).

Conclusion
In patients with early RA achieving clinical remission, patient function (HAQ-DI) and MRI measures of bone damage (erosion) predicted disease flare 6 and 12 months after treatment withdrawal. These variables may help identify patients with early RA in clinical remission as candidates for successful treatment withdrawal.

Trial registration
ClinicalTrials.gov, NCT01142726 (date of registration: June 11, 2010)

Supplementary Information
The online version contains supplementary material available at 10.1186/s13075-022-02735-8.

The burden of RA is considerable and includes pain, fatigue, reduced quality of life, and substantial socioeconomic costs [1,2].
Early treatment of RA with disease-modifying antirheumatic drugs (DMARDs) is recommended to reduce inflammation, relieve symptoms, and halt or minimize structural progression that may lead to disability [3][4][5]. A treat-to-target approach [6] has been widely adopted by physicians with the aim of achieving remission or, if not possible, low disease activity through close monitoring, medication adjustment, and the use of biologic (b) DMARDs when indicated.
Drug-free remission is a highly desirable goal for both patients and physicians. Although the tapering or discontinuation of bDMARDs is often recommended in patients with sustained remission [4], complete withdrawal of RA therapy may be possible in some patients without inducing disease flares. Modern imaging techniques, soluble biomarkers, and physician/patient-reported measures offer the potential to predict such flares. Ultrasound has been identified as a possible clinically applicable predictive tool for flares, but only in relatively small, non-randomized studies following the tapering or discontinuation of bDMARDs [7][8][9]. Data for biomarkers as predictive tools are conflicting [10] and data for physician- and patient-reported measures are lacking.
Magnetic resonance imaging (MRI) has been used to assess the severity of joint damage and inflammation as well as response to treatment in clinical trials and real-world practice. Studies have shown correlations between reduction in MRI-assessed inflammation and reduced joint damage [11,12] and have also demonstrated that MRI scores above (or below) a specific cut-off may be predictive of radiographic progression and/or low disease activity in patients with RA [13][14][15][16]. Identification of factors that predict flare could assist in determining which patients are suitable candidates for complete treatment withdrawal and aid individualized treatment decisions.
The T-cell costimulatory modulator, abatacept, approved for treatment of RA, halts the production of autoantibodies and proinflammatory cytokines by interrupting the cycle of T-cell activation initiated in RA. The Assessing Very Early Rheumatoid arthritis Treatment (AVERT) study of patients with early, active RA demonstrated that the proportion of patients with Disease Activity Score in 28 joints (C-reactive protein) (DAS28[CRP])-defined remission (DAS28[CRP] <2.6) was significantly higher following 12 months of treatment with abatacept plus methotrexate (MTX) versus MTX alone [17]. Additionally, a significantly higher proportion of patients treated with abatacept plus MTX versus MTX alone maintained drug-free remission for 6 months after withdrawal of all RA treatment [17]. However, the majority of patients experienced a disease flare within 6 months of treatment withdrawal and few patients sustained major responses for 1 year [18].
The objective of this post hoc analysis of the AVERT study was to investigate whether specific patient and disease characteristics, including MRI findings, of patients in DAS28(CRP)-defined remission at 12 months could be used to predict disease flare following treatment withdrawal of abatacept plus MTX, abatacept monotherapy, or MTX alone. Predefined cut-offs in patient-reported outcome (PRO) and MRI scores based on earlier literature were evaluated as predictors of flare.

Study design and patient population
This was a post hoc analysis of the AVERT (NCT01142726, June 11, 2010) study [17]. AVERT was a phase IIIb, randomized, active-controlled 24-month study in adult patients with early (≤2 years), active RA consisting of a 12-month double-blind treatment period and a subsequent treatment withdrawal period (see Supplementary Fig. 1 in Additional File 1). All patients in AVERT satisfied the 2010 American College of Rheumatology/European League Against Rheumatism classification criteria for RA [17,19] and were anti-cyclic citrullinated peptide (anti-CCP) positive. Details of sample size, power considerations, and methods for primary and secondary analyses in AVERT have previously been reported [17]. All patients who discontinued prior to completing the treatment or withdrawal period were imputed as non-responders for the month 12 or 18 analyses [17]. Patients enrolled in AVERT were MTX-naive or received MTX (≤10 mg/week) for ≤4 weeks with no MTX for 1 month prior to enrolment [17]. Patients were randomized to weekly subcutaneous abatacept 125 mg plus MTX (n = 119), abatacept 125 mg plus placebo (n = 116), or MTX plus placebo (n = 116) on day 1. MTX was initiated at 7.5 mg/week and titrated to 15-20 mg/week within 6-8 weeks [17].
For inclusion in the post hoc analysis, patients were required to have achieved DAS28(CRP)-defined remission [20] at month 12 and to have entered a subsequent 12-month withdrawal period in which all treatment was discontinued. Data from the three treatment arms in the double-blind period were pooled to increase the sample size for this analysis since it was believed that predictors of flare after treatment withdrawal should be independent of treatments used to achieve remission before drug discontinuation. This study was carried out in accordance with the Declaration of Helsinki. The AVERT study protocol was approved by the Institutional Review Board or Independent Ethics Committee at each site [17]. All study participants provided informed consent for involvement in the study.

Assessment of association between WD clinical and MRI variables of interest and flare status at WD+6mo and WD+12mo
The association between demographic, clinical, and MRI variables at WD and subsequent flares at WD+6mo and WD+12mo was evaluated (see Supplementary Fig. 1 in Additional File 1). WD variables were analyzed both as continuous measures and as dichotomous variables of interest using predefined cut-offs. A HAQ-DI cut-off score of >0.5 was used as an indicator of impaired physical function because a HAQ-DI score ≤0.5 has previously been considered an indicator of good physical function [22]. In line with Boolean criteria for remission, a cut-off score of >10 for patient pain and Patient Global Assessment scores (both VAS 0-100 mm scales) was used to indicate lack of remission status (based on study median values) [22]. Based on thresholds predictive of radiographic progression in previous studies, the following MRI cut-off scores were used: synovitis >3 [13], erosion >2 [16], bone edema >3 [13], unweighted combined inflammation >3 [16], and weighted combined inflammation >9 [13].
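The dichotomization described above can be sketched as follows. This is a minimal illustration and not the study's analysis code: the field names, the `combined_inflammation` and `dichotomize` helpers, and the example patient values are hypothetical. Only the cut-off values and the two combined-score definitions (synovitis + bone edema, and synovitis + 2 × bone edema) are taken from the text.

```python
# Cut-offs quoted in the text; a score strictly above its cut-off is flagged.
# The dictionary keys are hypothetical names chosen for this sketch.
CUTOFFS = {
    "haq_di": 0.5,
    "pain": 10,
    "patient_global": 10,
    "mri_synovitis": 3,
    "mri_erosion": 2,
    "mri_bone_edema": 3,
    "mri_combined_unweighted": 3,
    "mri_combined_weighted": 9,
}


def combined_inflammation(synovitis, bone_edema, weighted=False):
    """Synovitis + bone edema, or synovitis + 2 x bone edema when weighted."""
    weight = 2 if weighted else 1
    return synovitis + weight * bone_edema


def dichotomize(scores):
    """Return {variable: True if above its cut-off} for one patient's WD scores."""
    full = dict(scores)
    full["mri_combined_unweighted"] = combined_inflammation(
        scores["mri_synovitis"], scores["mri_bone_edema"]
    )
    full["mri_combined_weighted"] = combined_inflammation(
        scores["mri_synovitis"], scores["mri_bone_edema"], weighted=True
    )
    return {name: full[name] > cutoff for name, cutoff in CUTOFFS.items()}


# Hypothetical patient, not taken from the study data.
example_patient = {
    "haq_di": 0.625,
    "pain": 8,
    "patient_global": 12,
    "mri_synovitis": 2,
    "mri_erosion": 3,
    "mri_bone_edema": 4,
}
flags = dichotomize(example_patient)
print(flags)
```

Keeping the cut-offs in a single table makes it straightforward to re-run the same stratification with alternative thresholds.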

Statistical analyses
WD patient demographic and disease characteristics stratified by flare status at WD+6mo and WD+12mo were described. Differences between patient and disease characteristics at WD in patients with and without flare at WD+6mo and WD+12mo were estimated using a Student's t-test for equality of means (continuous variables) or a chi-square test (categorical variables). No correction for multiple testing was performed. Data from WD (or AVERT study baseline for age, weight, and duration of RA) were standardized to have a mean equal to zero and a standard deviation (SD) equal to one and were compared by the estimated differences between the flare versus no flare groups. P values of comparison were calculated by performing a Student's t-test. To assess the relationship of disease characteristics and MRI scores at WD with flare status at WD+6mo and WD+12mo, data from WD were standardized to have a mean equal to zero and an SD equal to one. Odds ratios (ORs) and P values were calculated from a logistic regression model for PROs of interest (HAQ-DI, patient pain, Patient Global Assessment), MRI measures, and DAS28(CRP) (to rule out any association with flare for patients in DAS28[CRP]-defined remission) by flare status. The scores at WD were the independent variables and flare at WD+6mo and WD+12mo were the dependent variables. Statistical significance was set at P < 0.05. Furthermore, the proportion of patients who experienced a flare or no flare at WD+6mo and WD+12mo, stratified by prespecified cut-off scores for PRO and MRI scores, was determined. Univariate logistic regression models were fitted for comparisons of flare rates above and below the predefined PRO and MRI cut-off scores to determine ORs with 95% confidence intervals (CIs) and associated P values. Finally, a multivariable logistic regression model, adjusted for treatment arm, was used to determine whether PRO and MRI scores at WD were independent predictors of flare at WD+6mo and WD+12mo.
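For orientation, the standardization and logistic regression models described above can be written out as follows. This is a notational sketch rather than the authors' specification: the symbols $x_i$, $z_i$, $\beta$, $\gamma$, and $a_i$ are introduced here for illustration, and the exact coding of treatment arm in the multivariable model is not stated in the text.

```latex
% Standardization of a WD score x_i (sample mean \bar{x}, sample SD s_x)
\[
  z_i = \frac{x_i - \bar{x}}{s_x}
\]

% Univariate model: one standardized WD score per model
\[
  \mathrm{logit}\,\Pr(\mathrm{flare}_i = 1) = \beta_0 + \beta_1 z_i,
  \qquad \mathrm{OR} = e^{\beta_1}
\]

% Multivariable model: dichotomized PRO/MRI indicators d_{ij}, adjusted for
% treatment arm via an indicator vector a_i (coding assumed for illustration)
\[
  \mathrm{logit}\,\Pr(\mathrm{flare}_i = 1)
    = \beta_0 + \sum_{j} \beta_j d_{ij} + \gamma^{\top} a_i,
  \qquad \mathrm{OR}_j = e^{\beta_j}
\]
```

Because each score is standardized to an SD of one, the OR in the univariate model corresponds to a one-SD increase in that score at WD.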

Clinical and MRI variables at WD in patients who experienced a flare versus patients who did not experience a flare at WD+6mo and WD+12mo
WD patient demographic characteristics were generally well-balanced across patients stratified by flare status at WD+6mo and WD+12mo (Table 1). HAQ-DI scores were significantly higher at WD for patients who experienced a flare compared to patients who did not experience a flare both at WD+6mo and WD+12mo (P = 0.0088 and 0.0095, respectively). Statistically significant differences in all MRI measures were observed for patients who experienced a flare compared to patients who did not experience a flare at WD+6mo (P ≤ 0.01; Table 1) and most MRI measures at WD+12mo (P ≤ 0.01 for all measures, except synovitis where P = 0.0107; Table 1). For patients who had experienced a flare by WD+6mo or WD+12mo (compared to no flare), the standardized estimated differences in HAQ-DI score and all MRI measures at WD were statistically significant (P < 0.02 for all; Fig. 1).

Dichotomized clinical and MRI variables at WD and their relationship with subsequent flare versus no flare at WD+6mo and WD+12mo
Patients were dichotomized according to predefined cut-off scores of PRO and MRI variables at WD. The association with flare at WD+6mo and WD+12mo based on these stratifications is shown in Supplementary Fig. 2 in Additional File 1. At both WD+6mo and WD+12mo, a higher proportion of patients with a HAQ-DI score of >0.5 experienced a flare compared with those who had a score of ≤0.5 (81% versus 51% and 88% versus 60%, respectively; Supplementary Fig. 2A in Additional File 1). The difference in the proportion of patients experiencing a flare with pain scores above (versus below) the predefined cut-off score (10) was less pronounced (WD+6mo: 64% versus 55%; WD+12mo: 72% versus 63%), and this was also true for the Patient Global Assessment scores.
For all MRI measures, higher proportions of patients with scores above the predefined cut-offs experienced a flare than patients with scores below the predefined cut-offs (Supplementary Fig. 2B in Additional File 1). For example, 86% (n/N = 25/29) of patients who had MRI weighted combined inflammation scores above the predefined cut-off of 9 experienced a flare at WD+6mo, while 53% (n/N = 67/126) with scores below the cut-off experienced a flare.
Furthermore, univariate logistic regression analysis was used to assess the relationship between dichotomized PRO and MRI variables at WD and flare status at WD+6mo and WD+12mo. Above-cut-off scores for HAQ-DI and most MRI variables at WD were significantly associated with flare at WD+6mo and/or WD+12mo (Supplementary Fig. 3). In the multivariable analysis (Fig. 3), Patient Global Assessment scores were only independently associated with flare at WD+12mo (OR 0.32 [0.10, 0.99], P = 0.0483), while pain was not independently associated with flare at either timepoint.
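To make the univariate comparison of flare rates above and below a cut-off concrete, the sketch below works through the MRI weighted combined inflammation counts quoted in the text for WD+6mo (25/29 above the cut-off, 67/126 below). For a single binary predictor, the OR from a logistic regression equals the cross-product ratio of the 2×2 table, so no model fitting is needed here. The function name `odds_ratio_2x2` is hypothetical and the Wald-type 95% interval is an assumption of this sketch. The resulting numbers are illustrative and are not quoted from the study's figures.

```python
import math


def odds_ratio_2x2(flare_high, total_high, flare_low, total_low, z=1.96):
    """Odds ratio (above vs below cut-off) with a Wald-type confidence interval.

    Hypothetical helper for illustration; assumes no cell of the table is zero.
    """
    a = flare_high                 # flare, score above cut-off
    b = total_high - flare_high    # no flare, score above cut-off
    c = flare_low                  # flare, score at or below cut-off
    d = total_low - flare_low      # no flare, score at or below cut-off
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts quoted in the text for weighted combined inflammation at WD+6mo.
pct_high = round(100 * 25 / 29)    # flare rate above the cut-off of 9
pct_low = round(100 * 67 / 126)    # flare rate at or below the cut-off
odds_ratio, ci_lower, ci_upper = odds_ratio_2x2(25, 29, 67, 126)

print(f"Flare rate above cut-off: {pct_high}%, below: {pct_low}%")
print(f"OR = {odds_ratio:.2f} (95% CI {ci_lower:.2f}, {ci_upper:.2f})")
```

An interval that excludes 1 is consistent with a significant univariate association. The multivariable model additionally adjusts for treatment arm and the other predictors, so its ORs cannot be reproduced from a single 2×2 table.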

Discussion
In this post hoc analysis of the AVERT study in patients with early, active RA, we identified predictors of disease flare in patients who discontinued all RA treatment after achieving DAS28(CRP)-defined remission at month 12. In multivariable analysis, HAQ-DI (physical function) and MRI-detected erosion (bone damage) scores at WD were found to be independent predictors of disease flare at WD+6mo and WD+12mo. MRI-detected weighted combined inflammation (incorporating synovitis and 2x bone edema) showed a trend towards independently predicting disease flare at WD+6mo and WD+12mo. These observations suggest that these measures may help guide physicians to make decisions with regard to drug withdrawal after remission is achieved in patients with RA treated with abatacept.
In RA, treatment withdrawal following the achievement of remission without subsequent disease flare (i.e., sustained drug-free remission) is a highly desirable goal [23]. Tools that could be incorporated into routine clinical practice to help characterize patients for whom sustained remission is more likely, or who are at a higher risk of flare, may help to guide treatment decisions. Recent studies have highlighted potential predictors of flare following treatment tapering or discontinuation in patients with long-standing RA and sustained remission, with several studies focusing on the utility of power Doppler ultrasound (PDUS) and MRI measures [7,8,13,16].
Figure legend (univariate logistic regression of flare status at WD+6mo and WD+12mo). Data from WD were standardized (mean equal to zero and SD equal to one). Vertical line indicates limit of effect: positive data indicate effect, negative data or data that cross 1 indicate absence of effect. ORs (per one unit) and P values are from a univariate logistic regression model with scores at WD as the independent variables and flare at WD+6mo and WD+12mo as the dependent variable; bold P values indicate statistical significance. *HAQ-DI: n = 94 for flare at WD+6mo and n = 65 at WD+12mo; pain: n = 94 for flare at WD+6mo and n = 66 at WD+12mo; MRI: n = 92 for flare at WD+6mo and n = 63 at WD+12mo. † Synovitis score + edema score. ‡ Synovitis score + 2x edema score. § HAQ-DI: n = 107 for flare at WD+6mo and n = 52 at WD+12mo; pain: n = 107 for flare at WD+6mo and n = 52 at WD+12mo; MRI: n = 103 for flare at WD+6mo and n = 52 at WD+12mo. OR odds ratio. See Fig. 1 for other definitions.
Patients in the AVERT study had early disease and all RA treatment (including MTX and corticosteroids) was withdrawn in those with a low disease activity score after 12 months. The AVERT study demonstrated that a significantly higher proportion of patients treated with abatacept plus MTX, versus MTX alone, maintained drug-free remission for 6 months after the withdrawal of all RA treatment [17]. Additionally, baseline corticosteroid use and Patient Global Assessment scores were found to be predictive of a shorter time to RA flare after treatment withdrawal and for the achievement of DAS28(CRP)-defined remission after 6 months of retreatment with abatacept plus MTX, respectively [18]. Despite seropositivity being linked to predicting better efficacy of abatacept in the AMPLE (Abatacept versus adaliMumab comParison in bioLogic-naivE rheumatoid arthritis subjects with background MTX) study [24], in the AVERT study, there was no link between withdrawal of abatacept and increased risk of flare in patients with anti-CCP positive RA [18]. In contrast to the previous AVERT study analysis, which explored whether clinical characteristics were associated with time to disease flare or with regaining disease control after treatment [18], the present analysis assessed which clinical characteristics were associated with flare after treatment withdrawal. We found several clinical characteristics to be associated with flare in univariate analyses: HAQ-DI, and MRI synovitis, erosion, bone edema, and weighted and unweighted combined inflammation scores. A previous analysis from the GO-BEFORE trial of bDMARD-naive patients with RA treated with tumor necrosis factor inhibitor (TNFi) therapy and/or MTX found that MRI synovitis, bone edema, and erosion independently correlated with physical function, pain, and Patient Global Assessment scores [25].
The current analysis found that both increased HAQ-DI scores (impaired physical function) and higher levels of MRI findings (inflammation or structural damage) were independently predictive of disease flare after treatment withdrawal. However, a previous study of RA treatment discontinuation after the achievement of remission in patients with recent-onset RA receiving conventional synthetic DMARDs identified low baseline HAQ-DI scores as a predictor for restarting treatment [26].
Studies have also shown the utility of synovitis scoring measured by PDUS for predicting the failure of bDMARD tapering and the identification of suitable patients for treatment tapering or discontinuation after the achievement of sustained remission with TNFi therapy [7,8]. As in the present study, no association between demographic variables and subsequent disease relapse was found [8]. However, another study found no association between PDUS and flare following TNFi discontinuation [27]; the latter study reported that TNFi treatment initiation early in the disease course was the main predictor of successful discontinuation [27].
The ability of MRI to detect subclinical joint inflammation [13][28][29][30][31][32] may explain our observation that MRI, but not laboratory measures of disease activity such as CRP or clinical measures such as SJC (28) or TJC (28), predicted risk of flare. As more data on predictors of flare after treatment taper or withdrawal are collected, a combination of clinical and imaging factors may be defined for the accurate identification of patients suitable for treatment withdrawal or those who would be at risk of flare. The costs of performing an MRI scan for this purpose would need to be balanced against potential savings in bDMARD usage [33] and the potential to spare patients unnecessary treatment.
Fig. 3 Multivariable logistic regression analysis assessing the value of cut-off scores for predicting flare. Analysis was performed for flare status at WD+6mo and WD+12mo. A multivariable logistic regression model, adjusted for treatment arm, determined whether patient-reported outcome (PRO) and MRI measures at WD were independent predictors of flare at WD+6mo and WD+12mo. P values in bold type indicate statistical significance. Vertical line indicates limit of effect: positive data indicate effect, negative data or data that cross 1 indicate absence of effect. *Synovitis score + (2x bone edema score). MTX methotrexate, SC subcutaneous. See Fig. 1 for other definitions.
The second stage of the present analysis was to test previously defined cut-off scores for their value in predicting flare. The cut-off scores for HAQ-DI (>0.5), pain, and Patient Global Assessment (both >10) tested in the current analysis were chosen based on prior evidence demonstrating these to be indicators of good physical function and Boolean remission [22]. Cut-off scores to test for MRI measures were chosen from two separate analyses. Baker et al. previously defined and validated thresholds of MRI synovitis and bone edema associated with low risk of radiographic progression in a subanalysis of data from randomized clinical trials of the TNFi golimumab in patients with RA (GO-BEFORE and GO-FORWARD studies) [13]; a cut-off score of ≤3 for MRI synovitis and bone edema was shown to identify patients at low risk of progression. In addition, a cut-off score of ≤9 for an MRI weighted combined inflammation score (synovitis score + 2x edema score) also identified patients with a very low risk of radiographic progression [13]. The cut-off scores for MRI erosion and unweighted combined inflammation were developed by Brahe and colleagues during a dose-tapering study of patients with RA being treated with bDMARDs (the Danish A Dose OPTimization of biological therapy [ADOPT] study) [16]. As part of that analysis, receiver operating characteristic curves were generated to identify cut-off values for baseline variables. The exploratory analysis showed that a cut-off score of ≤2 for MRI erosions and ≤3 for MRI combined inflammation could be used to predict successful tapering of therapy for patients in sustained remission [16]. It should be noted that some of the MRI cut-off scores described above were based on predicting radiographic progression, whereas in the current analysis the cut-off scores were used to predict disease flares.
The identification of important thresholds below which the safe withdrawal of effective treatments may be achieved is an important step forward in the precision use of therapies for RA.
Univariate analyses in the present study showed HAQ-DI and MRI synovitis, bone edema, erosion, and weighted and unweighted combined inflammation scores to be significantly predictive of flare 6 and 12 months following treatment withdrawal. Following multivariable analysis, we found HAQ-DI and MRI erosion scores to be predictors of disease flare at both 6 and 12 months following treatment withdrawal, while weighted combined inflammation showed a trend towards independently predicting disease flare. The finding that bone erosion was a predictor of disease flare in addition to inflammatory measures may indicate that those with RA-specific damage are also at higher risk of disease flare, perhaps related to a more severe disease phenotype. A recent post hoc analysis of the 2-year Danish IMAGINE-RA clinical trial (n = 171) showed baseline MRI osteitis (bone edema) and tenosynovitis to be independent predictors of 2-year MRI damage progression in patients with RA in clinical remission [34]. This further highlights the potential of MRI measures to guide an individualized approach to the management of RA.
Potential limitations of this study include the post hoc nature of the evaluation. The patient sample represented only a subgroup of the whole study population of AVERT and numbers were relatively small; thus, data should be interpreted with caution, as they may not be generalizable to other patient subgroups, different treatments, or the general RA population. Additionally, all patient data across the three treatment arms were pooled rather than stratified by treatment to provide a larger data set. As this study was conducted in patients with early RA, future studies will be needed to confirm whether the cut-off scores tested here would also predict disease flare in other RA populations or following the withdrawal or tapering of bDMARDs other than abatacept. There are several different definitions of flare (or relapse) in RA and, as such, results may vary slightly depending on which definition is used (and consequently which patients were included) [35][36][37].
Despite these limitations, the current post hoc analysis had the strength of using data from a 2-year clinical trial comprising a 12-month treatment period followed by a 12-month withdrawal period, in which patients were closely and systematically monitored. Furthermore, in AVERT, the withdrawal of all RA therapy (abatacept, background MTX, and glucocorticoids) allowed for the study of true drug-free remission. Finally, the testing of cut-offs for HAQ-DI, pain, Patient Global Assessment, and MRI measures provides a sense of how these measures may be used clinically to guide decisions surrounding WD or tapering of therapy in RA.

Conclusions
In summary, physical function (HAQ-DI) and objective MRI measures of inflammation and damage (erosion) at treatment withdrawal were independent predictors of flare 6 and 12 months after cessation of treatment with abatacept in patients with early RA in DAS28(CRP)-defined remission. Cut-off scores of these variables were independent predictors of flare and may have the